14 research outputs found

    On the aperiodic avoidability of binary patterns with variables and reversals

    In this work we present a characterisation of the avoidability of all unary and binary patterns that contain not only variables but also reversals of their instances, with respect to aperiodic infinite words. These types of patterns were studied recently in either more general or particular cases.

    Contextual partial commutations

    We consider the monoid T with the presentation which is "close" to trace monoids. We prove two different types of results. First, we give a combinatorial description of the lexicographically minimum and maximum representatives of their congruence classes in the free monoid {a; b}* and solve the classical equations, such as commutation and conjugacy, in T. Then we study the closure properties of the two subfamilies of the rational subsets of T whose lexicographically minimum and maximum cross-sections, respectively, are rational in {a; b}*. © 2010 Discrete Mathematics and Theoretical Computer Science

    A note on Thue games

    In this work we improve on a result from [1]. In particular, we investigate the situation where a word is constructed jointly by two players who alternately append letters to the end of an existing word. One of the players (Ann) tries to avoid (non-trivial) repetitions, while the other one (Ben) tries to enforce them. We show a construction that is closer to the lower bound shown in [2] using entropy compression, and building on the probabilistic arguments based on a version of the Lovász Local Lemma from [3]. We provide an explicit strategy for Ann to avoid (non-trivial) repetitions over a 7-letter alphabet.

    Revisiting Shinohara's algorithm for computing descriptive patterns

    A pattern α is a word consisting of constants and variables, and it describes the pattern language L(α) of all words that can be obtained by uniformly replacing the variables with constant words. In 1982, Shinohara presented an algorithm that computes a pattern that is descriptive for a finite set S of words, i.e., its pattern language contains S in the closest possible way among all pattern languages. We generalise Shinohara’s algorithm to subclasses of patterns and characterise those subclasses for which it is applicable. Furthermore, within this set of pattern classes, we characterise those for which Shinohara’s algorithm has a polynomial running time (under the assumption P ≠ NP). Moreover, we also investigate the complexity of the consistency problem of patterns, i.e., finding a pattern that separates two given finite sets of words.

    Pattern matching with variables: Efficient algorithms and complexity results

    A pattern α (i.e., a string of variables and terminals) matches a word w if w can be obtained by uniformly replacing the variables of α by terminal words. The respective matching problem, i.e., deciding whether or not a given pattern matches a given word, is generally NP-complete, but can be solved in polynomial time for restricted classes of patterns. We present efficient algorithms for the matching problem with respect to patterns with a bounded number of repeated variables and patterns with a structural restriction on the order of variables. Furthermore, we show that it is NP-complete to decide, for a given number k and a word w, whether w can be factorised into k distinct factors. As an immediate consequence of this hardness result, the injective version (i.e., different variables are replaced by different words) of the matching problem is NP-complete even for very restricted classes of patterns.
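
    To make the matching problem concrete, here is a minimal brute-force sketch. It is not one of the efficient algorithms from the paper: it simply backtracks over all possible variable images and can take exponential time. The function name `match`, the convention that uppercase letters are variables and all other symbols are terminals, and the restriction to non-empty substitutions are assumptions made for illustration only.

```python
def match(pattern, word, injective=False):
    """Return a substitution (dict: variable -> word) under which
    `pattern` matches `word`, or None if there is none.

    Assumed encoding (not from the paper): uppercase letters are
    variables, every other symbol is a terminal, and variables are
    replaced by non-empty words.
    """
    def solve(i, pos, subst):
        # All pattern symbols consumed: succeed only if the word is too.
        if i == len(pattern):
            return dict(subst) if pos == len(word) else None
        sym = pattern[i]
        if not sym.isupper():
            # Terminal: must agree with the next letter of the word.
            if word.startswith(sym, pos):
                return solve(i + 1, pos + 1, subst)
            return None
        if sym in subst:
            # Variable already bound: its image must reoccur here.
            image = subst[sym]
            if word.startswith(image, pos):
                return solve(i + 1, pos + len(image), subst)
            return None
        # Unbound variable: try every non-empty factor starting at pos.
        for end in range(pos + 1, len(word) + 1):
            image = word[pos:end]
            if injective and image in subst.values():
                continue  # injective version: images must be distinct
            subst[sym] = image
            result = solve(i + 1, end, subst)
            if result is not None:
                return result
            del subst[sym]
        return None

    return solve(0, 0, {})
```

    For example, `match("XaX", "bab")` binds `X` to `"b"`, while `match("XY", "aa", injective=True)` fails because both variables would need the image `"a"`.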

    Circular sequence comparison: algorithms and applications

    Background: Sequence comparison is a fundamental step in many important tasks in bioinformatics, from phylogenetic reconstruction to the reconstruction of genomes. Traditional algorithms for measuring approximation in sequence comparison are based on the notions of distance or similarity, and are generally computed through sequence alignment techniques. As circular molecular structure is a common phenomenon in nature, a caveat of the adaptation of alignment techniques for circular sequence comparison is that they are computationally expensive, requiring from super-quadratic to cubic time in the length of the sequences. Results: In this paper, we introduce a new distance measure based on q-grams, and show how it can be applied effectively and computed efficiently for circular sequence comparison. Experimental results, using real DNA, RNA, and protein sequences as well as synthetic data, demonstrate orders-of-magnitude superiority of our approach in terms of efficiency, while maintaining an accuracy very competitive to the state of the art.
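
    As a rough illustration of why q-grams suit circular sequences, the sketch below computes a plain q-gram profile distance in which each sequence is read with wrap-around. This is a simplified stand-in and not the distance measure introduced in the paper; the function names `circular_qgram_profile` and `circular_qgram_distance` are hypothetical.

```python
from collections import Counter


def circular_qgram_profile(seq, q):
    """Count every length-q factor of `seq` read circularly."""
    if not 0 < q <= len(seq):
        raise ValueError("q must satisfy 0 < q <= len(seq)")
    # Append the first q-1 letters so factors can wrap around the end.
    extended = seq + seq[:q - 1]
    return Counter(extended[i:i + q] for i in range(len(seq)))


def circular_qgram_distance(x, y, q):
    """Sum of absolute differences between the circular q-gram profiles."""
    px = circular_qgram_profile(x, q)
    py = circular_qgram_profile(y, q)
    # Counter returns 0 for q-grams that do not occur in a profile.
    return sum(abs(px[g] - py[g]) for g in px.keys() | py.keys())
```

    Because the profile ignores where the sequence starts, any two rotations of the same sequence are at distance 0, and the whole computation takes time linear in the sequence lengths for fixed q.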

    Clusters of repetition roots forming prefix chains

    We investigate lower bounds on the size of clusters (sets of starting positions of occurrences) of common prefixes shared by repetition roots. Such lower bounds in terms of the constituent roots in the sets provide upper bounds on the number of distinct repetitions. In the case of distinct square roots which are totally ordered by the prefix relation, it has been shown that there must be more occurrences of the common prefix than the number of roots. Here we develop the theory further by presenting the tools to extend the bounds to exponents higher than 2, and we show that they are optimal in the sense that any sequence of cluster sizes satisfying the lower bounds can be realized. We also take the next step towards the bounds on arbitrary (only partially prefix-ordered) sets of roots by proving a lower bound on unbordered prefixes shared by two overlapping prefix chains of roots.

    Sweep complexity revisited

    We study the sweep complexity of DFA in one-way jumping mode, answering several questions posed earlier. This measure is the number of times in the worst case that such machines have to return to the beginning of their input after having skipped some of the symbols. The class of languages accepted by these machines strictly includes the regular class and constant sweep complexity allows exactly the acceptance of regular languages. However, we show that there exist machines with higher than constant complexity still only accepting regular languages and that in general the sweep complexity of an automaton does not distinguish between accepting regular and non-regular languages. We establish separation results for asymptotic classes defined by this complexity measure and give a surprising exponential/logarithmic relation between factors of certain inputs which can be verified by such machines.

    A toolkit for Parikh matrices

    The Parikh matrix mapping is a concept that provides information on the number of occurrences of certain (scattered) subwords in a word. Although Parikh matrices have been thoroughly studied, many of their basic properties remain open. In the present paper, we describe a toolkit that has been developed to support research in this field. Its functionality includes elementary and advanced operations related to Parikh matrices and the recently introduced variants of P-Parikh matrices and L-Parikh matrices.
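
    For readers unfamiliar with the notion, the sketch below computes a Parikh matrix from the standard definition: each letter of an ordered alphabet maps to an identity matrix with one extra 1 just above the diagonal, and a word maps to the product of its letters' matrices. This is not code from the toolkit described above; the function name `parikh_matrix` and its signature are assumptions for illustration.

```python
def parikh_matrix(word, alphabet):
    """Parikh matrix of `word` over the ordered `alphabet`.

    For an alphabet of size k the result is a (k+1) x (k+1) upper
    triangular matrix (list of lists). With 0-based indices i <= j,
    entry [i][j + 1] counts the occurrences of the scattered subword
    alphabet[i] alphabet[i+1] ... alphabet[j] in `word`.
    """
    k = len(alphabet)
    index = {letter: q for q, letter in enumerate(alphabet)}
    # Start from the identity matrix (the image of the empty word).
    m = [[1 if r == c else 0 for c in range(k + 1)] for r in range(k + 1)]
    for letter in word:
        q = index[letter]
        # Right-multiplying by (I + E[q][q+1]) adds column q to column q+1.
        for row in m:
            row[q + 1] += row[q]
    return m
```

    For instance, `parikh_matrix("abab", "ab")` yields `[[1, 2, 3], [0, 1, 2], [0, 0, 1]]`: two occurrences of `a`, two of `b`, and three of the scattered subword `ab`. The mapping is not injective; `"acb"` and `"cab"` have the same Parikh matrix over the alphabet `"abc"`.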